Improve CI #27

Merged
merged 26 commits into from
Feb 6, 2024

Conversation

mwaskom
Collaborator

@mwaskom mwaskom commented Feb 5, 2024

This sets up CI to run on pull requests against this repo using all three configs that we distribute.

To make CI go (somewhat) fast and be economical, I am configuring it to use some different parameters compared to the base configs. See the new ci/prep_for_ci.py script for details, but namely:

  • Shorter sequence_len so that we can use A100-40GB instances
  • Truncated training data set with only 1000 examples
  • 2 epochs
  • Hardcoded small val_set_size, eval_batch_size, micro_batch_size

Having struggled quite a bit to get the CI to run because of surprising interactions between these variables (especially the validation set sizes), I am reconsidering whether this is a good idea, or whether it would be preferable to accept slow and expensive CI for the sake of distributing configs that we "know" will work. I think my preferred order of operations is:

  • Merge this with a baseline CI setup
  • Update to the latest version of axolotl, transformers, etc.
  • Revisit the question of exactly what we do in CI, as some of the issues that I have had may be fixed

@mwaskom
Collaborator Author

mwaskom commented Feb 6, 2024

I'm going to merge this and then start a new PR where I update axolotl, etc. There are some remaining issues with the configs (e.g., does flash attention really not work with mistral?) but I think it makes more sense to investigate them once we're using the latest versions of things.

@mwaskom mwaskom merged commit 3442b1f into main Feb 6, 2024
3 checks passed
@mwaskom mwaskom deleted the michael/add-ci branch February 6, 2024 19:15